Conversation

@harshaljanjani (Contributor) commented on Jan 17, 2026:

What does this PR do?

The following documentation improvements are made in this PR:

  1. Added docstring notes to the `num_sparse_encoder_layers` and `num_sparse_decoder_layers` parameters in `SwitchTransformersConfig`, explaining that when either is set to 0 in a single-layer model, the current implementation may still create a sparse layer because of how the sparse step is calculated (see the first sketch after this list). This edge case does not occur in existing checkpoints.
  2. Updated an outdated comment in `audio_utils.py` that claimed `spectrogram()` does not support batching, even though `spectrogram_batch()` already exists. The comment is now a note pointing readers to the batch function (see the usage sketch after this list).
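
For item 1, a minimal sketch of the sparse-step edge case; this paraphrases the layer-construction pattern with simplified variable names (`num_layers`, `num_sparse_layers`), so treat it as an illustration rather than a verbatim excerpt of the library code:

```python
# Illustration of the sparse-step edge case: a single-layer model with
# zero requested sparse layers still ends up with a sparse layer.
num_layers = 1
num_sparse_layers = 0

# The config derives the step between sparse layers; when no sparse layers
# are requested, it falls back to num_layers.
sparse_step = num_layers // num_sparse_layers if num_sparse_layers > 0 else num_layers

for i in range(num_layers):
    # With sparse_step == 1, the fallback branch marks the layer as sparse,
    # even though num_sparse_layers is 0.
    is_sparse = (i % sparse_step == 1) if sparse_step > 1 else sparse_step == 1
    print(f"layer {i}: is_sparse={is_sparse}")  # -> layer 0: is_sparse=True
```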
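For item 2, a usage sketch of the batch function the updated note points to; the waveforms and parameter values below are illustrative assumptions, not library defaults:

```python
import numpy as np
from transformers.audio_utils import spectrogram_batch, window_function

# Two mono waveforms of different lengths; spectrogram_batch() accepts a
# list of arrays, unlike the single-waveform spectrogram().
waveforms = [
    np.random.randn(16000).astype(np.float32),  # ~1 s at 16 kHz
    np.random.randn(8000).astype(np.float32),   # ~0.5 s at 16 kHz
]
window = window_function(window_length=400, name="hann")

specs = spectrogram_batch(
    waveforms,
    window=window,
    frame_length=400,
    hop_length=160,
    fft_length=512,
    power=2.0,  # power spectrogram
)
print([s.shape for s in specs])  # one (num_frequency_bins, num_frames) array per input
```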

Fixes #43335.

cc: @Rocketknight1

@harshaljanjani harshaljanjani marked this pull request as ready for review January 17, 2026 13:48
@harshaljanjani (Contributor Author) commented:

The failing tests are unrelated to this change; I'd appreciate a review when you get a chance, thanks!

@harshaljanjani changed the title from "fix: Correct Switch Transformers sparse layer logic and outdated spectrogram comment" to "docs: Add Switch Transformers docstring notes and update spectrogram comment" on Jan 20, 2026
@harshaljanjani (Contributor Author) commented:

The failing tests are unrelated to the change; the PR is ready for review, thanks!

@Rocketknight1 (Member) left a review:

Looks good to me!

@Rocketknight1 force-pushed the fix/switch-transformers-sparse-layer branch from ed76642 to a7727fb on January 21, 2026 13:49
@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1 (Member) commented:

CI is red because of some PyTorch 2.10 incompatibilities; we'll pin 2.9 and try rerunning soon!

@harshaljanjani (Contributor Author) commented:

> CI is red because of some PyTorch 2.10 incompatibilities; we'll pin 2.9 and try rerunning soon!

Sounds great, thanks!

@Rocketknight1 force-pushed the fix/switch-transformers-sparse-layer branch from a7727fb to 4ddb4fb on January 21, 2026 15:06
@github-actions commented:

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43336&sha=4ddb4f

@Rocketknight1 force-pushed the fix/switch-transformers-sparse-layer branch from 4ddb4fb to dc36a63 on January 22, 2026 14:14
@github-actions commented:

[For maintainers] Suggested jobs to run (before merge)

run-slow: switch_transformers


Merging this pull request may close: [BUG] SwitchTransformersConfig creates sparse layer when num_sparse_encoder_layers=0 with single layer model